-
Couldn't load subscription status.
- Fork 17
A stable, safe implementation of "Struct Target Features" #108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file (and trampoline.rs) contains the main code needed to understand this PR.
| /// See the module level docs [self]. | ||
| /// | ||
| /// We require static lifetimes as this is primarily internal to the macro. | ||
| pub const fn is_feature_subset<const N: usize>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function needs the most careful review, because its correctness is being relied upon for safety.
|
The "glamour shot" of this PR is that given: fearless_simd/fearless_simd_core/src/lib.rs Lines 236 to 241 in ac8cb44
You can run: fearless_simd/fearless_simd_core/src/lib.rs Lines 246 to 255 in ac8cb44
To entirely safely and soundly use Rust's SIMD intrinsics. To help guide review, the core contribution of this PR is a way to talk about target features in the type system. This is implemented through this trait: fearless_simd/fearless_simd_core/src/lib.rs Lines 24 to 50 in ac8cb44
Implementing fearless_simd/fearless_simd_core/src/lib.rs Lines 249 to 255 in ac8cb44
In this example, the SSE x86 functionality for multiplying is proven to be safe, and then executed and ran. Separately, in this PR, we have the functionality for properly using this on the x86_64 (and also plain x86) architectures. This is the contents of the
Every file in that folder (except for mod.rs files) is automatically generated by the binary crate of the |
Also removes unused additional impl support
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Soundness makes sense to me! Let me know if these TODOs are deliberate, once you've solved the question of whether to split out x86 into x86 and x86_64, and I'm happy to approve
fearless_simd_core/src/lib.rs
Outdated
| //! These examples use [bytemuck](https://crates.io/crates/bytemuck) for this. | ||
| //! | ||
| //! <!-- TODO --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this TODO deliberately left for completing in a later PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that was my approximate intention. I'll plan to follow up with that very soon after we land this, but I don't want to potentially block this entire PR on getting those reviewed now.
fearless_simd_core/src/lib.rs
Outdated
| //! | ||
| //! # Crate Feature Flags | ||
| //! | ||
| //! <!-- TODO --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a deliberately left TODO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's correct. This is something we'd update closer to release time - in particular, the std feature currently does nothing other than existing for forward-compatability, so there aren't actually any meaningful feature flags.
fearless_simd_core/src/lib.rs
Outdated
| /// Note that a function only operating on 128 bytes is probably too small for checking | ||
| /// whether a token exists just for it is worthwhile. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: grammar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just removed this block, as it's not really all that helpful.
| /// /// Perform some computation using SIMD. | ||
| /// #[target_feature(enable = "f1,f2")] | ||
| /// fn uses_simd(val: [f32; 4]) -> [f32; 4] { | ||
| /// // ... | ||
| /// } | ||
| /// | ||
| /// let a = [1., 2., 3., 4.]; | ||
| /// let Some(token) = token else { return scalar_fallback(a) }; | ||
| /// | ||
| /// trampoline!(Token = token => "f1,f2", uses_simd(a: [f32; 4]) -> [f32; 4]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a minor misunderstanding regarding why the token feature set needs to be declared "f1,f2" here, whilst the feature set is also declared on the uses_simd function?
When would the feature string passed to trampoline and the function target feature strings diverge? On this line what happens if in the trampoline call "f1,f2" is accidentally written as "f1"?
Similarly, I need clarification of the the utility of multiple tokens. E.g. [Token = token, Sse = my_sse] => "f1,f2,sse"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, I need clarification of the the utility of multiple tokens. E.g. [Token = token, Sse = my_sse] => "f1,f2,sse"?
Attempt at answering my own question.
The list of tokens provided to trampoline provides an explicit list of permitted/witnessed features. The function passed to trampoline (uses_simd in this example), has required features declared in the target_feature attribute. Thus, trampoline enforces that it is only safe to call this function if the provided tokens contain the subset of required features?
Very very cool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, exactly. This is due to our dependency of the target features 1.1 Rust feature, which moves that safety check into the Rust compiler.
And yes, you can use multiple tokens exactly as you describe. I'll see about adding some docs for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To put this another way, declaring the target feature string in the macro body allows us to move the target features from just being an attribute, to being both an attribute but also a const value, which we can then perform validation on.
| //! This abstraction is designed to be combined with target features 1.1, the recent update | ||
| //! in the Rust compiler to allow calling `#[target_feature]` functions safely from within | ||
| //! other `#[target_feature]` functions. | ||
| //! As such, once you have used the [`trampoline!`] macro, you can call any intrinsic in [`core::arch`]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I am starting to understand the power of this abstraction. Per Stabilize target_feature_11 #134090, it is unsafe to call a function with target_feature declared unless the caller is in a context with those features. Thus the initial call into the target_feature context is unsafe. This trampoline! provides a safe alternative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the "glamour shot" described in comment #108 (comment) makes perfect sense to me now! I really like this PR. Apologies that it's taking me a while to get through. On Monday I plan to go through the code that's outside fearless_simd_core.
Rename `trampoline.rs` to `support.rs` The old name conflicted with the name of the macro, leading to it being harder to find the docs of the macro itself. Remove unneeded reference Remove entire note on 128 bytes being too small The point it was making was: - Fairly hard to explain - Not necessarily true Add a few more test cases Co-authored-by: Taj Pereira <[email protected]>
|
Thanks both for the excellent reviews - even thought they clearly raced, all the comments were very helpful for improving things! |
|
I had another thought after reviewing your other PR. I'm wondering if this is something that trampoline can express. Trampoline is excellent cause you can safely call into SIMD target features from non target features callees. Is trampoline also useful to constrain target features? Imagine that you can use 512bit vectors but there's some particular function that triggers the CPU to downclock. Could you use trampoline to narrow the target features when calling from a callee with more target features into a SIMD function intentionally restricting the features available and avoiding a potential CPU downclock? I don't know how useful this would be in practice but I'm curious. Great work by the way! Edit: on further thought I realize this is a gap in my target features understanding. |
There's a few things to say here, to hopefully help guide your understanding:
|
The core contributions of this PR are:
trampolinemacro, which validates a#[target_feature(enable = "xxx")]string against values of one or more of these, ensuring at compile time that a call to a#[target_feature]function will be safe; and then calling it.The state of this feature is:
The x86-64-v{1,2,3,4} level implementations do not exist/are extremely incomplete.Some docs are missing (these are however not the most critical docs, it's only docs on the groupings of x86 features).There is also an open licensing question, around the docs taken from the Rust reference. My preference would be to copy https://github.com/rust-lang/reference/blob/1d930e1d5a27e114b4d22a50b0b6cd3771b92e31/LICENSE-MIT#L1 into our LICENSE-MIT, which avoids having to make a decision about copyright-ability here.
My proposed next steps are:
For review:
fearless_simd_core/x86/xxx/xxx.rs, as these are entirely automatically generated. The exception isfearless_simd_core/x86/xxx/mod.rs, which are hand-written, but don't have any logic.Discussed on Zulip: #simd > Removing `safe-wrappers`